Fast low-power shared division and square-root architecture

Authors

  • Martin Kuhlmann
  • Keshab K. Parhi
Abstract

This paper addresses a fast low-power implementation of a shared division and square-root architecture. Two approaches are considered: the SRT (Sweeney, Robertson and Tocher) approach, which does not require prescaling, and the GST (generalized Svoboda and Tung) approach, which requires prescaling of the operands. This paper makes two important contributions. Although the SRT division and square-root approaches and the GST division approach have been known for a long time, square-root architectures based on the GST approach have not been proposed so far. This paper, for the first time, develops a GST square-root architecture that does not require an additional division by the scaling factor after the square-root operation. Although various divider and square-root architectures have been compared with respect to speed, no tradeoffs with respect to the power consumption of these architectures have been studied so far. A quantitative comparison of the speed and power consumption of GST and SRT division/square-root units is the second main contribution of the paper. Shared divider and square-root units are designed based on the SRT and the GST approaches, in both minimally and maximally redundant radix-4 representations. Simulations demonstrate that the worst-case overall latency of the minimally redundant GST architecture is 35% smaller than that of the SRT architecture. Alternatively, for a fixed latency, division and square-root operations based on the minimally redundant GST architecture consume 42% and 20% less power, respectively, than the maximally redundant SRT approach.
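
For readers unfamiliar with the radix-4 digit-recurrence scheme the abstract refers to, the following is a minimal sketch of the recurrence w[j+1] = 4*w[j] - q[j+1]*d with a redundant digit set {-a, ..., a}, where a = 2 is the minimally redundant and a = 3 the maximally redundant radix-4 case. It is not the architecture from the paper: the function name `radix4_divide` is hypothetical, the quotient digit is chosen by exact rounding rather than by the truncated-estimate selection logic a hardware SRT unit would use, and GST prescaling and the square-root recurrence are not modeled at all.

```python
from fractions import Fraction


def radix4_divide(x, d, steps, a=2):
    """Illustrative radix-4 digit-recurrence division (not the paper's design).

    x, d  : dividend and divisor, with 1/2 <= d < 1 and |x| <= (a/3) * d
    steps : number of radix-4 quotient digits to produce
    a     : largest digit magnitude (2 = minimally, 3 = maximally redundant)
    Returns (digits, quotient, final_residual).
    """
    assert a in (2, 3)
    rho = Fraction(a, 3)              # redundancy factor a / (r - 1) for r = 4
    x, d = Fraction(x), Fraction(d)
    assert Fraction(1, 2) <= d < 1 and abs(x) <= rho * d

    w = x                             # residual w[0]
    q = Fraction(0)
    digits = []
    for j in range(1, steps + 1):
        shifted = 4 * w
        # Assumption: exact selection by rounding, clamped to the digit set.
        digit = max(-a, min(a, round(shifted / d)))
        w = shifted - digit * d       # w[j] = 4 * w[j-1] - q_j * d
        q += Fraction(digit, 4 ** j)
        digits.append(digit)
        assert abs(w) <= rho * d      # residual stays within its bound
    return digits, q, w


if __name__ == "__main__":
    digits, q, w = radix4_divide(Fraction(1, 4), Fraction(3, 4), steps=8)
    print(digits, float(q))
```

After n steps the invariant x = d*q + w*4^(-n) holds exactly, so the quotient error is bounded by (a/3)*4^(-n). Exact rational arithmetic is used here only to keep the sketch easy to check; it says nothing about the latency or power figures reported above.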

Similar resources

Low-Power Radix-4 Combined Division and Square Root

Because of the similarities in the algorithm it is quite common to implement division and square root in the same unit. The purpose of this work is to implement a low-power combined radix-4 division and square root floating-point double precision unit and to compare its performance and energy consumption with a radix-4 division only unit. Previous work has been done on reducing the energy dissi...

Floating Point Division and Square Root Algorithms and Implementation in the AMD-K7 Microprocessor

This paper presents the AMD-K7 IEEE 754 and x87 compliant floating point division and square root algorithms and implementation. The AMD-K7 processor employs an iterative implementation of a series expansion to converge quadratically to the quotient and square root. Highly accurate initial approximations and a high performance shared floating point multiplier assist in achieving low division an...

Low Power Division and Square Root

of the Dissertation Low Power Division and Square Root by Alberto Nannarelli Doctor of Philosophy in Engineering University of California, Irvine, 1999 Professor Tomás Lang, Chair The general objective of our work is to develop methods to reduce the energy consumption of arithmetic modules while maintaining the delay unchanged and keeping the increase in the area to a minimum. Here, we present ...

Designing Of Modified Area Efficient Square Root Carry Select Adder(SQRT CSLA)

In the design of integrated circuits, the necessity of portable systems is increasing, and area occupancy plays a vital role. Square Root Carry Select Adder (SQRT CSLA) is one of the fastest adders, which is used in this data-processing processor to perform fast arithmetic functions. In this paper, an area-efficient square root carry select adder (SQRT CSLA design) by sharing Common Boolean logic t...

New Embedded Memory Architecture for Enhanced Yield, Performance and Power Consumption

A new Cross-Shared Redundancy (CSR) architecture of embedded memory for yield improvement is proposed. The model of CSR takes into account cluster errors, which are common for deep-submicron technologies. The redundancy scheme is optimized in consideration of low-power and fast operation. A yield model of cross-shared redundancy for the embedded memory is presented.

Publication date: 1998